
In order to test our system, we trained an instructor on the English part of the GIVE-2
corpus~\cite{GarGarKolStr10}. The GIVE-2 corpus was developed to let the GIVE-2 Challenge participants train their systems. It consists of 63 interactions and is manually annotated with the semantics of its referring expressions. For our experiment we do not use these manual annotations but only the instructions, annotated with their reactions according to the algorithm described in Section~\ref{sec:annotation}. A fragment of the annotated corpus is shown in Appendix~\ref{corpus-fragment}.

\begin{figure}
\centering
\begin{tabular}{cc}
\subfloat[red closest to the chair in front of you]{\label{fig:seq1}\includegraphics[scale=0.22]{images/map1-screenshot.png}} &
\subfloat[the closet one]{\label{fig:seq2}\includegraphics[scale=0.22]{images/map2-screenshot.png}} \\
\subfloat[good]{\label{fig:seq3}\includegraphics[scale=0.22]{images/map2-1-screenshot.png}} &
\subfloat[go back to the room with the lamp]{\label{fig:seq4}\includegraphics[scale=0.22]{images/map3-screenshot.png}}
\end{tabular}
\caption{Example interaction: pushing a button and leaving the room}
\label{example-interaction}
\end{figure}

Fig.~\ref{example-interaction} shows a fragment of an interaction between the instructor we obtained and a human IF.\footnote{A video of a successful task completed while interacting with the system can be watched at \url{http://cs.famaf.unc.edu.ar/~luciana/give-OUR}} The figure displays a bird's-eye map of the virtual world and the 3D first-person view of the user. In Fig.~\ref{fig:seq1}, the IF, represented by a blue character, has just entered the upper left
room and receives the instruction ``red closest to the chair in front of you''.
The referring expression uniquely
identifies the target object using the spatial proximity of the target to the
chair---``red closest to the chair''---and the relative position of the player with respect to the target---``in front of you''. 
This instruction is selected by our algorithm on the basis of the current state of the task plan, the IF's
position and orientation, and the fact that this expression made the IF in the corpus manipulate the intended target. 
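The selection step just described can be sketched roughly as follows. This is an illustration, not the authors' implementation: the \texttt{Utterance} structure, the action names, and the prefix test over the plan are all assumptions.

```python
# Rough sketch (assumed representation): each corpus utterance is annotated
# with the reaction it caused; an utterance is a candidate when its reaction
# is a prefix of the current task plan, i.e. saying it advances the plan
# from the IF's current state.
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    text: str            # surface form exactly as found in the corpus
    reaction: List[str]  # IF actions observed after the utterance


def candidate_utterances(corpus, plan):
    return [u for u in corpus
            if u.reaction and plan[:len(u.reaction)] == u.reaction]


# Hypothetical corpus fragment and plan for the situation in Fig. 1(a).
corpus = [
    Utterance("red closest to the chair in front of you", ["push(b1)"]),
    Utterance("go back to the room with the lamp", ["leave(r1)", "enter(r2)"]),
]
plan = ["push(b1)", "leave(r1)"]
candidates = candidate_utterances(corpus, plan)
# Only the first utterance advances the current plan.
```

The prefix test is one simple way to operationalize "this utterance made corpus IFs do what the plan requires next"; the actual matching criterion is described in Section~\ref{sec:annotation}.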

After receiving the instruction, the IF gets closer to the button as
shown in Fig.~\ref{fig:seq2}.  As a result of the new IF position, a new task
plan is computed, the set of candidate utterances is recalculated, and the system
selects a new utterance, namely ``the closet one''. The ellipsis of the
nouns ``button'' and ``chair'' follows from the utterances
normally produced in the corpus at this stage of the task plan, that is, when the IF
is about to manipulate the target and has probably already received a complete referring expression. 
The instruction contains a spelling error (``closet'' instead of ``closest'') because the surface form of the instructions found in the corpus is used directly for generation.

Right after the IF clicks on the
button (Fig.~\ref{fig:seq3}), the task plan changes because the button causes a door to open. As a result, even though the player state did not change, the system selects a new set of candidate utterances corresponding to
the new task plan. The next step in the plan is to move away from the button. The utterances in the corpus that have this reaction are acknowledgements, and this is what the automated IG learns to say, uttering ``good'' at this point. 
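The effect just described, where a change in the task plan alone triggers a new candidate set, can be illustrated as follows. The dictionary representation and the action names are hypothetical, chosen only to make the point concrete.

```python
# Illustrative sketch (assumed representation, not the system's actual one):
# after the button press the plan's next step changes, so a different set of
# corpus utterances -- those annotated with the new step as their reaction --
# becomes the candidate set, even though the player has not moved.

corpus = {
    "the closet one": ["push(b1)"],       # referring expression
    "good": ["move-away(b1)"],            # acknowledgement
}


def candidates_for(plan, corpus):
    """Utterances whose annotated reaction matches the plan's next step."""
    return [text for text, reaction in corpus.items()
            if reaction[0] == plan[0]]


plan_before_click = ["push(b1)", "move-away(b1)"]
plan_after_click = ["move-away(b1)", "leave(room)"]
```

Before the click, only the referring expression is a candidate; after it, the acknowledgement is, which is why ``good'' is selected at this point.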

After receiving the acknowledgement, the user turns
around and walks forward, and the next action in the plan is to leave the room
(Fig.~\ref{fig:seq4}). The system selects the utterance ``go back to the room
with the lamp'', which refers back to the previous interaction. The system keeps
no explicit representation of the past actions of the user, but such utterances are the
ones found at this stage of the task plan, when the IF must have already visited the room with the lamp. 

\begin{figure*}
%\begin{multicols}{2}
\begin{small}
\begin{tabular}{l}
go back to the room with the lamp\\
okay now go back to the original room\\
ok go back again to the room with the lamp\\
now i ned u to go back to the original room\\
nowin to the shade room\\
down the passage\\
go back to the hallway\\
Go through the opening on the left with the yellow wall paper\\
okay now go back to where you came from\\
now go back\\
go back out of the room\\
out the way you came in\\
exit the way you entered\\
ok now go out the same door\\
go back to the door you came in\\
Go through the opening on the left\\
closest the door\\
go back out\\
now go back out\\
left\\
straight\\
L\\
go\\
yes\\
\end{tabular}
\end{small}
%\end{multicols}
\caption{All candidate utterances (i.e., pragmatic paraphrases) when exiting the room in Fig.~\ref{fig:seq4} ordered by reaction length. }
\label{all-utts}
\end{figure*}

Fig.~\ref{all-utts} lists all the pragmatic paraphrases that are candidates to be selected in the state of Fig.~\ref{fig:seq4}. As described in Section~\ref{selection}, the
utterance with the longest reaction is selected first---``go back to the room with
the lamp''; later on, if the user does not execute the expected reaction, utterances with a shorter reaction are said---``go back to the hallway''.  The candidate paraphrases differ not only in their reaction length but also in their realization style, from telegraphic style like ``L'' (which GIVE players usually ground to mean ``left'') to
full sentences like ``Go through the opening on the left with the yellow wall
paper''. They also differ in their speech act, from acknowledgements such as ``yes'' and direct imperatives such as ``go back out'' to indirect requests such as ``now i ned u to go back to the original room''. All of the instructions, even if quite different among
themselves, could have been successfully used to exit the room. This is because all of them have the same reaction, which makes them pragmatic paraphrases.  
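The longest-reaction-first strategy with fallback can be sketched in a few lines. The (surface form, reaction length) pairs and the particular lengths below are assumptions for illustration, not the system's actual data.

```python
# Minimal sketch of longest-reaction-first selection with fallback.
# The pair representation and the reaction lengths are assumptions.

def select_utterance(candidates, already_said=()):
    """Pick the not-yet-said paraphrase with the longest reaction; when the
    IF does not react as expected, calling again with the failed utterance
    in `already_said` falls back to a shorter-reaction paraphrase."""
    remaining = [c for c in candidates if c[0] not in already_said]
    return max(remaining, key=lambda c: c[1], default=None)


# Hypothetical subset of the paraphrases in the figure above.
paraphrases = [
    ("go back to the room with the lamp", 7),
    ("go back to the hallway", 5),
    ("L", 1),
]
first = select_utterance(paraphrases)
retry = select_utterance(paraphrases, {first[0]})
```

Here \texttt{first} is the paraphrase with the longest reaction and \texttt{retry} is the shorter-reaction fallback said when the IF does not execute the expected reaction.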
